Using resource timing data for server push in multiple web page transactions

ABSTRACT

This patent document describes, among other things, methods and systems for determining which if any page resources a server might push to a client (using, e.g., an HTTP 2.0 server push mechanism). The approaches described herein improve web page load times by pushing page resources that a client is likely to need to render the base page, while reducing wasteful server pushes of resources that the client is unlikely to request from the server because, for example, they are already cached at the client.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. application Ser. No. 15/011,409, filed Jan. 29, 2016, which is based on and claims the benefit of priority of: U.S. Application No. 62/110,413, filed Jan. 30, 2015, and U.S. Application No. 62/110,416, filed Jan. 30, 2015, and of U.S. Application No. 62/110,418, filed Jan. 30, 2015, the contents of all of which are hereby incorporated by reference in their entireties.

BACKGROUND

Technical Field

This application relates generally to distributed data processing systems and to the delivery of content to users over computer networks.

Brief Description of the Related Art

Distributed computer systems are known in the art. One such distributed computer system is a “content delivery network” or “CDN” that is operated and managed by a service provider. The service provider typically provides the content delivery service on behalf of third parties. A “distributed system” of this type typically refers to a collection of autonomous computers linked by a network or networks, together with the software, systems, protocols and techniques designed to facilitate various services, such as content delivery or the support of outsourced site infrastructure. This infrastructure is shared by multiple tenants, the content providers. The infrastructure is generally used for the storage, caching, or transmission of content (such as web pages, streaming media and applications) on behalf of such content providers or other tenants. The platform may also provide ancillary technologies used therewith including, without limitation, DNS query handling, provisioning, data monitoring and reporting, content targeting, personalization, and business intelligence.

In a known system such as that shown in FIG. 1, a distributed computer system 100 is configured as a content delivery network (CDN) and has a set of servers 102 distributed around the Internet. Typically, most of the servers are located near the edge of the Internet, i.e., at or adjacent end user access networks. A network operations command center (NOCC) 104 may be used to administer and manage operations of the various machines in the system. Third party sites affiliated with content providers, such as web site 106, offload delivery of content (e.g., HTML or other markup language files, embedded page objects, streaming media, software downloads, and the like) to the distributed computer system 100 and, in particular, to the CDN servers (which are sometimes referred to as content servers, or sometimes as “edge” servers in light of the possibility that they are near an “edge” of the Internet). Such CDN servers 102 may be grouped together into a point of presence (POP) 107 at a particular geographic location.

The CDN servers are typically located at nodes that are publicly-routable on the Internet, in end-user access networks, peering points, within or adjacent nodes that are located in mobile networks, in or adjacent enterprise-based private networks, or in any combination thereof.

Typically, content providers offload their content delivery by aliasing (e.g., by a DNS CNAME) given content provider domains or sub-domains to domains that are managed by the service provider's authoritative domain name service. The service provider's domain name service directs end user client machines 122 that desire content to the distributed computer system (or more particularly, to one of the CDN servers in the platform) to obtain the content more reliably and efficiently. The CDN servers respond to the client requests, for example by fetching requested content from a local cache, from another CDN server, from an origin server 106 associated with the content provider, or other source, and sending it to the requesting client.

For cacheable content, CDN servers typically employ a caching model that relies on setting a time-to-live (TTL) for each cacheable object. After it is fetched, the object may be stored locally at a given CDN server until the TTL expires, at which time it is typically re-validated or refreshed from the origin server 106. For non-cacheable objects (sometimes referred to as ‘dynamic’ content), the CDN server typically returns to the origin server 106 when the object is requested by a client. The CDN may operate a server cache hierarchy to provide intermediate caching of customer content in various CDN servers that are between the CDN server handling a client request and the origin server 106; one such cache hierarchy subsystem is described in U.S. Pat. No. 7,376,716, the disclosure of which is incorporated herein by reference.

Although not shown in detail in FIG. 1, the distributed computer system may also include other infrastructure, such as a distributed data collection system 108 that collects usage and other data from the CDN servers, aggregates that data across a region or set of regions, and passes that data to other back-end systems 110, 112, 114 and 116 to facilitate monitoring, logging, alerts, billing, management and other operational and administrative functions. Distributed network agents 118 monitor the network as well as the server loads and provide network, traffic and load data to a DNS query handling mechanism 115. A distributed data transport mechanism 120 may be used to distribute control information (e.g., metadata to manage content, to facilitate load balancing, and the like) to the CDN servers. The CDN may include a network storage subsystem (sometimes referred to herein as “NetStorage”) which may be located in a network datacenter accessible to the CDN servers and which may act as a source of content, such as described in U.S. Pat. No. 7,472,178, the disclosure of which is incorporated herein by reference.

As illustrated in FIG. 2, a given machine 200 in the CDN comprises commodity hardware (e.g., a microprocessor) 202 running an operating system kernel (such as Linux® or variant) 204 that supports one or more applications 206 a-n. To facilitate content delivery services, for example, given machines typically run a set of applications, such as an HTTP proxy 207, a name service 208, a local monitoring process 210, a distributed data collection process 212, and the like. The HTTP proxy 207 (sometimes referred to herein as a global host or “ghost”) typically includes a manager process for managing a cache and delivery of content from the machine. For streaming media, the machine may include one or more media servers, such as a Windows® Media Server (WMS) or Flash server, as required by the supported media formats.

A given CDN server 102 seen in FIG. 1 may be configured to provide one or more extended content delivery features, preferably on a domain-specific, content-provider-specific basis, preferably using configuration files that are distributed to the CDN servers using a configuration system. A given configuration file preferably is XML-based and includes a set of content handling rules and directives that facilitate one or more advanced content handling features. The configuration file may be delivered to the CDN server via the data transport mechanism. U.S. Pat. No. 7,240,100, the contents of which are hereby incorporated by reference, describes a useful infrastructure for delivering and managing CDN server content control information, and this and other control information (sometimes referred to as “metadata”) can be provisioned by the CDN service provider itself, or (via an extranet or the like) the content provider customer who operates the origin server. U.S. Pat. No. 7,111,057, incorporated herein by reference, describes an architecture for purging content from the CDN. More information about a CDN platform can be found in U.S. Pat. Nos. 6,108,703 and 7,596,619, the teachings of which are hereby incorporated by reference in their entirety.

In a typical operation, a content provider identifies a content provider domain or sub-domain that it desires to have served by the CDN. When a DNS query to the content provider domain or sub-domain is received at the content provider's domain name servers, those servers respond by returning the CDN hostname (e.g., via a canonical name, or CNAME, or other aliasing technique). That network hostname points to the CDN, and that hostname is then resolved through the CDN name service. To that end, the CDN name service returns one or more IP addresses. The requesting client application (e.g., browser) then makes a content request (e.g., via HTTP or HTTPS) to a CDN server machine associated with the IP address. The request includes a host header that includes the original content provider domain or sub-domain. Upon receipt of the request with the host header, the CDN server checks its configuration file to determine whether the content domain or sub-domain requested is actually being handled by the CDN. If so, the CDN server applies its content handling rules and directives for that domain or sub-domain as specified in the configuration. These content handling rules and directives may be located within an XML-based “metadata” configuration file, as mentioned previously.

The CDN platform may be considered an overlay across the Internet on which communication efficiency can be improved. Improved communications techniques on the overlay can help when a CDN server needs to obtain content from origin server 106, or otherwise when accelerating non-cacheable content for a content provider customer. Communications between CDN servers and/or across the overlay may be enhanced or improved using improved route selection, protocol optimizations including TCP enhancements, persistent connection reuse and pooling, content & header compression and de-duplication, and other techniques such as those described in U.S. Pat. Nos. 6,820,133, 7,274,658, 7,607,062, and 7,660,296, among others, the disclosures of which are incorporated herein by reference.

As an overlay offering communication enhancements and acceleration, the CDN server resources may be used to facilitate wide area network (WAN) acceleration services between enterprise data centers and/or between branch-headquarter offices (which may be privately managed), as well as to/from third party software-as-a-service (SaaS) providers used by the enterprise users.

In this vein CDN customers may subscribe to a “behind the firewall” managed service product to accelerate Intranet web applications that are hosted behind the customer's enterprise firewall, as well as to accelerate web applications that bridge between their users behind the firewall to an application hosted in the Internet cloud (e.g., from a SaaS provider).

To accomplish these two use cases, CDN software may execute on machines (potentially in virtual machines running on customer hardware) hosted in one or more customer data centers, and on machines hosted in remote “branch offices.” The CDN software executing in the customer data center typically provides service configuration, service management, service reporting, remote management access, customer SSL/TLS certificate management, as well as other functions for configured web applications. The software executing in the branch offices provides last mile web acceleration for users located there. The CDN itself typically provides CDN hardware hosted in CDN data centers to provide a gateway between the nodes running behind the customer firewall and the CDN service provider's other infrastructure (e.g., network and operations facilities). This type of managed solution provides an enterprise with the opportunity to take advantage of CDN technologies with respect to their company's intranet, providing a wide-area-network optimization solution. This kind of solution extends acceleration for the enterprise to applications served anywhere on the Internet. By bridging an enterprise's CDN-based private overlay network with the existing CDN public internet overlay network, an end user at a remote branch office obtains an accelerated application end-to-end. FIG. 3 illustrates a general architecture for a WAN optimized, “behind-the-firewall” service offering such as that described above. Information about a behind the firewall service offering can be found in U.S. Pat. No. 7,600,025, the teachings of which are hereby incorporated by reference.

For live streaming delivery, the CDN may include a live delivery subsystem, such as described in U.S. Pat. No. 7,296,082, and U.S. Publication Nos. 2011/0173345 and 2012/0265853, the disclosures of which are incorporated herein by reference.

Turning to the topic of network protocols, the Hypertext Transfer Protocol (HTTP) is a well-known application layer protocol in the art. It is often used for transporting HTML documents that define the presentation of web pages, as well as embedded resources associated with such pages. The HTTP 1.0 and 1.1 standards came about in the 1990s. Recently, HTTP 2.0, a major revision to HTTP, has been approved for standards track consideration by the IETF (RFC 7540). The HTTP 2.0 proposed standard has been in development for some time (see, e.g., HTTP version 2, working draft, draft-ietf-httpbis-http2-16, Nov. 29, 2014). According to that working draft and RFC 7540, HTTP 2.0 enables efficient use of network resources and a reduced perception of latency by introducing header field compression and allowing multiple concurrent messages on the same connection. It also introduces unsolicited push of representations from servers to clients. HTTP 2.0 is based on an earlier protocol, SPDY, which also offered an unsolicited push feature.

Server push features present the opportunity for increased efficiencies, but must be used wisely. For example, it is known in the art to predict resources that a client may request, given an initial request (e.g., for a base HTML page). A variety of prediction algorithms are known in the art, including the prefetching approaches described in U.S. Pat. No. 8,447,837, US Patent Publication No. 2014/0379840, US Patent Publication No. 2015/0089352, and US Patent Publication No. 2015/0120821, the contents of all of which are hereby incorporated by reference.

It is also known in the art to use predictions to push resources to a client using the push mechanism contemplated in SPDY and HTTP 2.0. Pushing content to the client can result in wasted bandwidth if the prediction is wrong, or if the client already has the resource in a client-side cache. To address this issue, it has been proposed in the prior art that the hint mechanism of SPDY could be used to search the browser's cache to ensure that already-cached resources are not re-fetched by the proxy. (See, e.g., Nicholas Armstrong, Just in Time Push Prefetching: Accelerating the Mobile Web, University of Waterloo Master's Thesis, 2011.) Further, Uzonov (Andrey Uzonov, Speeding Up Tor With SPDY, Master's Thesis, Munich Technical University, 2013) proposes collecting statistical data about resource requests for a page, and for subsequent page requests, pushing resources when his proposed algorithm(s) are confident enough that they would be requested in the page load. The algorithms described by Uzonov take into account the frequency with which a resource is requested overall, or for a particular page load, as well as the number of times that a resource has been seen after the first page load in a session, or in prior page loads. Several algorithms are proposed. Uzonov investigates the use of a cost function for pushing resources that accounts for hits and mistakes. Uzonov also proposes, among other things, considering the device type or browser type (user-agent) in determining whether to push assets, setting a maximum asset size for push, and keeping track of the assets that have been provided to the client previously (at the server or at the client) to avoid re-sending them.

While the foregoing approaches are valuable, there remains a need for improved approaches that intelligently determine those objects a server should push to a client, and those objects a server should not push, when leveraging a push mechanism such as that provided by HTTP 2.0. The teachings hereof are not necessarily limited to HTTP 2.0, but apply to any mechanism for pushing web page components from a server to a client.

The teachings hereof can be used to improve the efficiency of web page loading and of network usage, among other things.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention will be more fully understood from the following detailed description taken in conjunction with the accompanying drawings, in which:

FIG. 1 is a schematic diagram illustrating one embodiment of a known distributed computer system configured as a content delivery network;

FIG. 2 is a schematic diagram illustrating one embodiment of a machine on which a CDN server in the system of FIG. 1 can be implemented;

FIG. 3 is a schematic diagram illustrating one embodiment of a general architecture for a WAN optimized, “behind-the-firewall” service offering;

FIG. 4 is a waterfall chart; and,

FIG. 5 is a block diagram illustrating hardware in a computer system that may be used to implement the teachings hereof.

DETAILED DESCRIPTION

The following description sets forth embodiments of the invention to provide an overall understanding of the principles of the structure, function, manufacture, and use of the methods and apparatus disclosed herein. The systems, methods and apparatus described herein and illustrated in the accompanying drawings are non-limiting examples; the claims alone define the scope of protection that is sought. It is contemplated that implementations of the teachings hereof will vary with design goals, performance desires and later developments, without departing from the teachings hereof. The features described or illustrated in connection with one exemplary embodiment may be combined with the features of other embodiments. Such modifications and variations are intended to be included within the scope of the present invention. All patents, publications and references cited herein are expressly incorporated herein by reference in their entirety.

Throughout this disclosure, the term “e.g.” is used as an abbreviation for the non-limiting phrase “for example.” Basic familiarity with well-known web page and networking technologies and terms, such as HTML, URL, XML, AJAX, CSS, HTTP, and TCP/IP, is assumed. In this disclosure, the terms page ‘object’ and page ‘resource’ are used interchangeably with no intended difference in meaning. The term base page is used herein to refer to a page defined by an associated markup language document (e.g., HTML) that references one or more embedded resources (e.g., images, CSS, Javascript, or other types), as known in the art.

Overview

As mentioned above, HTTP 2.0 offers a server-push facility. Typically, a client will open a connection to a server and request a base HTML document for a web page. Typically this base page is not cacheable in a proxy cache server such as those described above (e.g., CDN server 102), and so the proxy server will go forward to origin. According to this disclosure, while waiting for the base HTML of the page to arrive, the proxy server can push objects down to the client via the Push-Promise mechanism of HTTP 2.0. There is also the possibility of pushing content after a particular page load completes. It is expected that a performance benefit of server-push can be achieved by taking advantage of this “dead time” on the wire.

Resource Timing data can be used to provide a feed of information that informs the server what to push to the client. In other words, the resources identified from that data can be prefetched and then pushed by the server. This patent disclosure discusses two specific cases: pushing content to the client to try to speed up only the page that is currently loading, and pushing content to the client to try to speed up some multi-page transaction that the client is executing.

Pre-pushing content can improve performance, but it needs to be used with caution. Pushing an object that is not used by the client is wasted effort, potentially uses up bytes against an end-user's cellular download limit, potentially increases congestion on the client connection, and can potentially displace more useful resources from the end-user's browser cache.

Resource Timing Data

The Resource Timing API is known in the art and is implemented in most browsers. It allows Javascript to extract detailed timing data for each embedded object fetched during the load of a base page. A CDN can gather and beacon back the full suite of Resource Timing (RT) data on a desired percentage (e.g., 1 to 5 percent, or a few percent, or other value) of all page loads of content providers who have enabled the functionality. From the RT data it is possible to reconstruct a waterfall chart showing all the resources that loaded during the page load, and the timings of each. More description of RT data and how it can be collected by a CDN platform is available in US Patent Publication No. 2013/0166634 A1, the contents of which are hereby incorporated by reference in their entirety.

FIG. 4 is a waterfall chart that was reconstructed from real Resource Timing Data collected using a real-user-monitoring system of the kind described in US Patent Publication No. 2013/0166634 A1. It is a partial view that shows load times for some of the resources on the page. The URL and resource names are genericized.

As noted, other resources loaded on the particular page, but the partial view of FIG. 4 illustrates a few points:

1. The base page HTML, page.html, took 446 ms to load, and nothing else was happening on the wire while it was being fetched.
2. Page resources 2 to 23 took little or no time to load, indicating that they were very likely browser-cache hits. This is not definite in all cases, but it's very likely.
3. Several objects immediately below page-resource-23 took multiple hundreds of milliseconds to load.

From this analysis, one can conclude that the simple approach of pushing the first N objects on a page is probably not optimal. The first N objects are likely JS (Javascript), CSS (cascading style sheets), and images that always appear on the page. As such, a browser will generally have them cached, and so pushing them will not speed up the page load at all. As known in the art, HTTP 2.0 offers a ‘cancel’ mechanism whereby the client can cancel a push-promise received from the server, but the push/cancel cycle is likely to be slow and will eat up the limited time available for pushing useful objects. It would be much better in the above example to start by pushing page-resource-25 and not waste time on the earlier resources.

Single-Page Proposal

Since a system can collect RT data on a percentage of all page views, one can get many samples for the same base page over the course of a day or a week. Further, the sampling rate can be controlled to get as many or as few samples as we desire. Given this, it becomes possible to reliably extract several independent metrics from the data:

- freq=percentage of samples showing that a particular resource was loaded during the load of a specific base page. If, for example, there are 100 RT records in the database for a particular base page, and 23 of them show an entry for some particular embedded object, then freq for that embedded-object/base-page pair is 23%.
- lat=median amount of time it took to load a particular resource. Again, this metric is more formally defined as applying to an object/base-page pair rather than to an object in isolation.
- seq=index in the fetch order (how many resources were fetched prior to the one in question). This metric can vary across samples on the same base page due to random timing delays, so we can use a median value computed across observed samples for a given object/base-page pair.
- The type of object (image, CSS, JS, etc.) can in most cases be inferred from the file name (e.g., the extension), and where this is not possible an external data-collection utility can gather up the object-type data (the value of the Content-Type header) by doing HEAD or GET requests against specific embedded objects that were identified by RT data.
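The freq, lat and seq metrics defined above can be illustrated with a minimal sketch. It assumes a hypothetical input shape that is not specified in this disclosure: each RT sample for one base page is a list of (url, duration_ms) tuples in fetch order. The function name compute_metrics is likewise an assumption made for illustration.

```python
from collections import defaultdict
from statistics import median


def compute_metrics(samples):
    """Compute freq, lat and seq per embedded resource for ONE base page.

    `samples` is a list of RT records; each record is a list of
    (url, duration_ms) tuples in fetch order (an assumed input shape).
    """
    total = len(samples)
    durations = defaultdict(list)   # url -> observed load times (ms)
    positions = defaultdict(list)   # url -> observed fetch-order indexes

    for record in samples:
        seen = set()
        for index, (url, duration_ms) in enumerate(record):
            if url in seen:          # count a resource once per page load
                continue
            seen.add(url)
            durations[url].append(duration_ms)
            positions[url].append(index)

    return {
        url: {
            # percentage of samples in which the resource appeared
            "freq": 100.0 * len(durations[url]) / total,
            # median load time across the samples where it appeared
            "lat": median(durations[url]),
            # median number of resources fetched before this one
            "seq": median(positions[url]),
        }
        for url in durations
    }
```

Medians are used for lat and seq, as the text suggests, so that a few outlier page loads do not skew the per-resource values.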

The product freq*lat can be used to give an indication of how impactful any given resource is to overall page load time. If freq is low, the resource is rarely loaded on the page, and so it's probably not useful to push it. Similarly, if lat is low, the resource is likely often present in the browser cache, so again there's unlikely to be a benefit to pushing it. A process can compute the resources with the highest freq*lat value and publish the results to a table. A subset of the resources can be selected, e.g., by omitting those with values less than some threshold. For example, in one embodiment, the freq threshold might be 40% and the lat threshold might be 20 ms. In other embodiments, the freq might be in a range of 40-70%, and the lat might be in a range of 10-25 ms. In yet another embodiment, the threshold can be 40*30=1200. The threshold can be set on the product (e.g. 40*30, or 40*20, per the prior embodiment) rather than individually on the components (e.g. 40% and 30 ms) in order to focus on overall impact to page load time. For example, a resource with freq=20% and lat=100 ms (a product of 2000) would be a candidate in this scheme, whereas it would not be if the threshold were applied individually to freq and lat. Approximate ties can be broken by prioritizing resources with lower seq values and/or by preferring certain object types, such as those that represent CSS or Javascript. (More advanced thresholding can also be applied, for example, applying thresholds on freq and lat in addition to their product. As an example, any resource requiring 300 ms or more to load might be automatically considered for push, even if often cached, because the penalty for not having it is so high.) In general one can define some function f(freq,lat,seq,object-is-css-or-js), and rank objects to pre-push according to the value of the function. In one embodiment, the function can be:

Score=freq*lat−K1*seq+K2*(object-is-css-or-js)

Reasonable values for K1 and K2 might be 1 and 10, for example. In other embodiments values might range between K1=1 to 100 and K2=1 to 100, which would allow swinging the weights of the latter two terms up into a range where they actively compete with the first term in most cases. (A weighting, that is, a coefficient, could also be applied to the freq*lat product, in some embodiments.) In this case, if freq=100% and lat=100 ms, then the leftmost term is 10,000 and dominates the rightmost terms no matter what their value. But if freq=10% and lat=10 ms, then the leftmost term is only 100, and the other two terms have significant impact on the results. A proposed implementation might fix these values across all sites (e.g., as defined by hostname), or might allow them to vary from site to site, trying to tune them for optimal performance.
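As a minimal sketch of the scoring function above, assuming freq is expressed in percent and lat in milliseconds, and using the example defaults K1=1 and K2=10 (the function and parameter names are hypothetical):

```python
def push_score(freq, lat, seq, is_css_or_js, k1=1.0, k2=10.0):
    """Score = freq*lat - K1*seq + K2*(object-is-css-or-js).

    freq: percentage of page loads that fetched the resource (0-100)
    lat:  median load time in milliseconds
    seq:  median index in the fetch order
    is_css_or_js: True when the object is CSS or Javascript
    """
    type_bonus = 1 if is_css_or_js else 0
    return freq * lat - k1 * seq + k2 * type_bonus
```

With these defaults, a CSS resource with freq=100, lat=100 and seq=5 scores 10005, so the product term dominates. A non-CSS, non-JS resource with freq=10, lat=10 and seq=40 scores 60, where the seq term removes a large share of the product.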

Given the above function, the offline utility (e.g., a computer machine distinct from the CDN proxy servers, such as the back-end system 308 and/or visualization system 310 shown in US Patent Publication No. 2013/0166634 A1) can compute the score for each embedded resource that is on a hostname that is configured to be allowed to be pushed on the base page connection. The HTTP 2.0 specification provides the specific conditions under which an object is a candidate for server-push. For example, the HTTP 2.0 specification indicates that a server that provides a pushed response should be configured for the corresponding request. It states that “A server that offers a certificate only for ‘example.com . . . is not permitted to push a response for ‘https://www.example.org/doc.’” See RFC 7540, Sec. 8.2.2.

Given the list of candidate objects and the computed scores, the offline utility can identify the resources with the highest scores, apply some thresholding as described above, and publish a table to a proxy server, such as CDN servers described above. The proxy server can then push one or more of these resources, in scoring order, upon receiving a client request for the base page associated with the table.
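The selection step can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the input shape (a dict of per-resource freq, lat and score values), the function name, and the max_entries cap are hypothetical. The product threshold of 1200 and the 300 ms rule are the example values given earlier in this description.

```python
def build_push_table(candidates, min_product=1200.0,
                     always_consider_lat_ms=300.0, max_entries=10):
    """Return resource URLs to push for one base page, best score first.

    `candidates` maps url -> {"freq": ..., "lat": ..., "score": ...}.
    A resource is kept when freq*lat meets the product threshold, or
    when its median load time is high enough to be considered anyway.
    """
    selected = [
        (url, m) for url, m in candidates.items()
        if m["freq"] * m["lat"] >= min_product
        or m["lat"] >= always_consider_lat_ms
    ]
    # Highest score first, so the proxy can push in scoring order.
    selected.sort(key=lambda item: item[1]["score"], reverse=True)
    return [url for url, _ in selected[:max_entries]]
```

The returned list is what the offline utility would publish for the base page; the proxy can walk it front to back for as long as there is idle time on the connection.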

The HTTP proxy application on the server will need to store the tables describing which objects to pre-push on each base page. It may not have enough memory to do this for a large number of base pages, so it can track these tables only for the base pages it is serving most often. Alternatively, with some loss of fidelity, the offline process that computes the tables can generate a single table that covers some class of base pages within a given content provider customer.
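One way a proxy might keep tables only for its most frequently served base pages is sketched below. The class name, its methods and the eviction policy (replace the least-requested stored page when a more-requested page arrives) are assumptions chosen for illustration; this disclosure does not prescribe a particular policy. Note that the request counter itself is unbounded here, which a production design would also need to limit.

```python
from collections import Counter


class PushTableStore:
    """Holds at most `capacity` push tables, favoring busy base pages."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.hits = Counter()   # base page -> number of requests seen
        self.tables = {}        # base page -> list of URLs to push

    def record_request(self, base_page):
        self.hits[base_page] += 1

    def offer(self, base_page, table):
        """Try to store a table; return True if it was stored."""
        if self.capacity <= 0:
            return False
        if base_page in self.tables or len(self.tables) < self.capacity:
            self.tables[base_page] = table
            return True
        # Store is full: evict the least-requested page, but only if
        # the new page is requested more often than that one.
        coldest = min(self.tables, key=lambda page: self.hits[page])
        if self.hits[base_page] > self.hits[coldest]:
            del self.tables[coldest]
            self.tables[base_page] = table
            return True
        return False

    def lookup(self, base_page):
        """Return the push list for a base page, or an empty list."""
        return self.tables.get(base_page, [])
```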

Finally, it's often the case that a customer publishes different base pages to different groups of end-users. The selection might be based on the device type (hand-held versus tablet versus laptop/desktop), the end-user's language, user-agent, the end-user's geography, or any of a host of other variables. A customer could communicate these distinctions to the CDN, and the offline utility could compute one table in each of these categories for any given base page. For example, a pre-push table for base-page.html for mobile devices, a pre-push table for base-page.html for desktop devices, etc. Targeting the pre-push table to known variants on the base page can significantly improve the pre-push accuracy. More detail about this is provided in the next section.

Clustering For Page Variants

With more work, variants can be computed from the RT data via a clustering algorithm. In one possible approach, the utility would know in advance about a large number of factors that are commonly used to deliver differing base pages, and would compute a table of push objects for each (e.g., a table computed from the RT data that was collected from page loads on mobile devices, or a table computed from the RT data that was collected from page loads in a particular geography). The tables that end up with the highest average scoring function would very likely be the ones that most closely match the content provider customer's actual base page variants (e.g., the conclusion being that the content provider customer provides a specific page for mobile devices, or for a particular geography). The offline utility could then publish both the table and the basis for computing it (geography, language, cookie value, etc.), and thereby instruct the HTTP proxy in the server about the set of variants to consider.

In one possible implementation, an analysis engine (e.g., implemented by computer machine) might separate the set of beacons (and thereby the RT data) for a given website into a number of distinct partitions. (See, US Patent Publication No. 2013/0166634 A1, incorporated herein by reference, for a description of the beaconing process and in particular the beacon data types in section 3.1.3, which include User Agent string.) Each partition can correspond to a factor that the origin server might use to deliver differing content to differing bodies of end users. So, for example, the engine might separate the set of beacons for one web site (e.g., as keyed by hostname) into two partitions: those that came from mobile devices and those that did not. It might use the User Agent string to make this determination. As known in the art, a User Agent string contains information about the system and browser of a client, which can reveal whether the device is a mobile device (e.g., iOS, iPad, etc.).

Once the beacons have been partitioned in this way, the engine can run the above-described push-candidate-selection algorithm over each of the two partitions. If the two sets of push candidates are significantly different, the engine can conclude that the origin server does indeed deliver different content to mobile devices than it does to non-mobile devices. One approach for determining whether the differences are significant would be to look at the fraction of resources that are not shared/common in the two partitions. For example, if the common resources are less than a certain threshold (e.g., 50%, or 70%, or somewhere in between), the system can regard the two partitions as different. It would thereby know that in the future it should apply this partitioning when computing the actual push lists to be used in production. If, however, the two push lists do not differ significantly, the engine would conclude that the origin does no such selection of content, and therefore that the mobile-versus-non-mobile question can be ignored for production use. Then, the engine might repeat this process for other factors: the geographic location of the client (perhaps based on the language spoken in the country of the client machine or a mapping of IP to geography), the type of hardware the client is using (desktop, laptop, handheld), the estimated connectivity speed of the client (e.g., dial-up, DSL, Cable/Fios®, which can be obtained from commercial geo-location services), and possibly other factors. EdgeScape® from Akamai Technologies Inc. is one commercial service that provides such kinds of information. In general such information can be obtained by having a server(s) record statistics such as round trip time, size and transfer time for requests, and then analyzing this data offline to generate historical throughput and keying off of, e.g., IP address. The IP address can also be characterized into connectivity type based on, e.g., knowledge of an AS and/or throughput to the IP. The engine can then repeat this end-to-end analysis daily or weekly, so as to track changes to the origin behavior with some reasonable latency.
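
The partition comparison described above can be illustrated with a minimal sketch. The function names are hypothetical, and the way the common fraction is normalized (shared resources divided by all distinct resources in either list) is an assumption; the text only says that the common resources are compared against a threshold such as 50%.

```python
def common_fraction(push_a, push_b):
    """Fraction of all distinct resources that appear in both push lists.

    Assumption: the fraction is normalized by the union of the two lists.
    """
    a, b = set(push_a), set(push_b)
    union = a | b
    if not union:
        return 1.0  # two empty lists are treated as identical
    return len(a & b) / len(union)


def partitions_differ(push_a, push_b, threshold=0.5):
    """True when the shared fraction falls below the threshold (e.g., 0.5 to 0.7)."""
    return common_fraction(push_a, push_b) < threshold


mobile = ["a.css", "b.js", "m.png"]
desktop = ["a.css", "d.js", "e.png", "f.png"]
print(partitions_differ(mobile, desktop))
```

When the function reports a difference, the factor in question (here, mobile versus non-mobile) would be retained for production push lists; otherwise it can be ignored.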

Having completed the above analyses for each factor in question, the engine can perform a final partition based on all factors to which it determined the origin to be sensitive. For example, if the origin is determined to be sensitive to both hardware type and connection speed, the engine can partition the beacons into the following nine sets: (1) desktops on dialup, (2) laptops on dialup, (3) handhelds on dialup, (4) desktops on DSL, (5) laptops on DSL, (6) handhelds on DSL, (7) desktops on cable/Fios®, (8) laptops on cable/Fios®, (9) handhelds on cable/Fios®. It can then send the resultant nine distinct sets of push lists to the edge servers, along with instructions for the specific conditions (e.g., the above factors) under which that edge server should use each list. If the number of distinct partitions becomes too large to manage, the engine can successively merge the most-similar sets of push lists in order to reduce it.
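
As a rough sketch of the final partitioning step, the partitions can be enumerated as the cross product of the values of each sensitive factor. The function name and the dictionary layout are hypothetical and chosen only for illustration.

```python
from itertools import product


def partition_keys(sensitive_factors):
    """Enumerate one partition key per combination of factor values.

    sensitive_factors maps a factor name to its possible values
    (hypothetical layout, not taken from the source).
    """
    names = list(sensitive_factors)
    value_lists = [sensitive_factors[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


factors = {
    "hardware": ["desktop", "laptop", "handheld"],
    "connection": ["dialup", "DSL", "cable"],
}
keys = partition_keys(factors)
print(len(keys))  # three hardware types times three connection types
```

Each key would then be associated with its own push list before distribution to the edge servers.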

Transaction Proposal

The above ideas can be extended to the problem of trying to speed up a multi-page web transaction (in other words, a web navigation sequence). For example, it is very common in the eCommerce world to define a transaction of the form:

1. Visit some landing page, like www.customer.com.
2. Enter a search term into a box on that page, and thereby be directed to a search results page.
3. Click a link on the search-results page and thereby be directed to a product page.
4. Click the add-to-cart link. This might take you to a new page, or might just submit data to the server, updating some field on the current page from Javascript but not actually visiting any new page.
5. Click on the checkout link and thereby be directed to a checkout page.
6. Sequence through one or more checkout pages to complete the transaction.

Depending on the design of the site in question, the number of pages visited in such a transaction might vary; a typical number might be between 2 and about 10.

This raises the possibility of choosing objects to push to the client based on the expected cumulative benefit to the end-to-end transaction, as opposed to the expected benefit for the single page-fetch that is in progress. It also raises the possibility of pushing objects to the client during idle time that occurs while the end-user is reading and/or interacting with the web page, which can substantially expand the number of objects that might be pushed.

In order to accomplish this, the server needs to (1) recognize that a particular end-user is engaged in a particular transaction, (2) identify the expected steps in that transaction prior to the time the end-user has fully executed it, (3) identify the content that is most likely to be beneficial for performance on each step of the transaction, and (4) identify the available time windows in which to push. Accomplishing this while maintaining a low rate of wasted pre-push is a significant challenge.

Identifying Time Windows in Which to Push

The discussion starts with task (4). A server can immediately open a push time window when it receives a request for a base page HTML document that it has to forward to the origin server. It can close that push window when the base page arrives from the origin. It can re-open the push window when the connection from the end-user has gone idle and remained idle for N milliseconds. N can be something fixed like 200 ms or in a range of 100-500 ms (to make it probable that the delay is not just rendering or Javascript execution time on the client), or it could be derived from the RT data by examining the duration of any dead time that commonly occurs during the fetch of the base page in question.
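
The window logic above can be sketched as a single predicate. The function and parameter names are hypothetical; the 200 ms default is the example value given in the text.

```python
def push_window_open(awaiting_origin, idle_ms, idle_threshold_ms=200):
    """Decide whether the server may push right now.

    awaiting_origin: True while a base page request has been forwarded to the
        origin and the base page has not yet arrived.
    idle_ms: how long the end-user connection has been idle, in milliseconds.
    """
    # The window is open while waiting on the origin, and re-opens once the
    # connection has stayed idle for at least the threshold.
    return awaiting_origin or idle_ms >= idle_threshold_ms


print(push_window_open(awaiting_origin=False, idle_ms=250))
```

A threshold derived from RT data could be passed as `idle_threshold_ms` in place of the fixed value.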

Identifying Objects to Push

Moving on to task (3), assuming a server knows the expected transaction steps (where steps herein mean pages, i.e., step 1 is page 1 in the sequence, etc.), it can look up the object score tables for each base page in the transaction, and identify the objects with the largest scores across the whole series. Having identified these objects, the server can apply a "discount function" to reduce the scores associated with objects that are nearer the end of the transaction versus nearer the end-user's current step in the transaction. This is because the probability of the end-user actually needing these objects diminishes with the number of page views still remaining in the transaction before encountering those objects. In one embodiment the discount function can be a linear function: for example, if an object is not needed until N more steps of the transaction have completed (N pages further in the navigation sequence), the score on that object can be reduced by k*N percent, where k might be about 10, or in a range of about 5-25.
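
A minimal sketch of the linear discount function follows. The function name is hypothetical, and capping the reduction at 100 percent (so that scores never go negative) is an assumption not stated in the text.

```python
def discounted_score(score, steps_ahead, k=10.0):
    """Reduce a score by k * steps_ahead percent.

    steps_ahead is N, the number of further transaction steps before the
    object is needed; k is about 10, or in a range of about 5-25.
    """
    reduction_pct = min(k * steps_ahead, 100.0)  # assumed cap at 100 percent
    return score * (1.0 - reduction_pct / 100.0)


# An object needed three pages later with k = 10 keeps 70 percent of its score.
print(discounted_score(80.0, steps_ahead=3))
```

An object needed on the current page (N = 0) keeps its full score, so objects for the page in progress are naturally favored over later ones.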

On the first push window (loading the base page on the first step of a transaction), the server can choose to push only objects that are expected to speed up that base page. This is because the initial push window is short, and the server does not yet know that the client is going to engage in a transaction at all. The server can rationally expect that the push window that opens after the first page has been delivered but before the second page has been requested will be longer, and can therefore choose to push objects expected to help later phases of the transaction if the scores so indicate.

When a client requests the base page at step N in the transaction, the server can choose to eliminate all push candidates that were likely needed at a prior step, since the client very likely has these objects already. It can apply a different function on the score here than the one described above, preferably based only on freq. For example, suppose object X is identified as a push candidate at steps 1, 3, 4, and 6 of a transaction, and that the corresponding freq values for object X appearing on the corresponding pages are f1, f3, f4, and f6. When the end-user client makes the request for step 6 (in other words, after making requests at the prior steps and reaching step/page 6), the probability that the client already has object X is 1 − (1 − f1)*(1 − f3)*(1 − f4). Generalizing, the probability can be calculated as $1 - \prod_{i=1}^{N} (1 - \mathrm{freq}_i)$, where 1 . . . N index the prior steps in which the object X was identified as a push candidate. The server can apply a threshold (e.g., about 20%) to this computation to decide whether to push the object at step 6. In other embodiments, other thresholds might be used, e.g., in a range of 10-30%, or otherwise. These are non-limiting examples.
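
The probability computation above can be sketched as follows. The function names are hypothetical, and the direction of the threshold test (push only when the client is unlikely to have the object already) is an assumption, since the text states only that a threshold is applied.

```python
from math import prod


def prob_already_has(prior_freqs):
    """Probability that the client already has the object, given the freq
    values for the prior steps where it was a push candidate."""
    return 1.0 - prod(1.0 - f for f in prior_freqs)


def should_push(prior_freqs, threshold=0.20):
    # Assumption: push only if the probability that the client already has
    # the object is below the threshold (e.g., about 20%).
    return prob_already_has(prior_freqs) < threshold


# Object X was a candidate at steps 1, 3 and 4; the client now requests step 6.
f1, f3, f4 = 0.5, 0.5, 0.5
print(prob_already_has([f1, f3, f4]))  # 0.875
print(should_push([f1, f3, f4]))       # False
```

With no prior candidate steps the product is empty and the probability is zero, so the decision falls back entirely on the object's score.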

Identifying a Transaction

Current solutions in the field gather RT data for a subset of all page views (e.g., a small percentage), by sampling individual pages at random (without regard to any transaction the end-user might be in). However, one can convert these to a transactional sampling approach. A method for achieving this is as follows: when an end-user visits a page, the server will check for the existence of a particular cookie that has domain scope (i.e., applies to the entire site example.com rather than a given page). If it does not exist or enough time has passed since that cookie was last set on this user (e.g., by looking at a timestamp in the cookie), the server will declare this page fetch to be the start of a transaction. It will then decide whether or not it wants to sample this transaction. It might pick purely at random to achieve a particular sampling rate, or it might use characteristics of the page or the end-user to decide to sample (e.g., via customer metadata configuration that identifies a landing page, or a customer's preference for optimizing performance for specific bodies of end-users, such as those in a particular geography). If the transaction is selected for sampling, the server sets a cookie with a current timestamp, indicating that all page fetches on the given hostname should be sampled for some period of time (typically a few minutes) from this end-user. The cookie preferably includes a transaction identifier (e.g., a constant that is reported back with the beacon data and enables the system 308/310 to correlate the data corresponding to a given transaction).
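
The cookie-driven sampling decision might be sketched as below. The function name, the dictionary representation of the cookie, the five-minute window, and the 1% rate are all assumptions for illustration; only the purely random selection policy is shown.

```python
import random
import uuid


def on_page_fetch(cookie, now, window_s=300.0, sample_rate=0.01, rng=random.random):
    """Return (sampled, cookie_to_send) for one page fetch.

    cookie: None, or a dict {"ts": seconds, "txn_id": str} read from the
        domain-scoped cookie (hypothetical representation).
    now: current time in seconds.
    """
    if cookie is not None and now - cookie["ts"] < window_s:
        # A fresh cookie means this fetch belongs to a sampled transaction.
        return True, cookie
    # No cookie, or it has expired: this fetch starts a new transaction.
    if rng() < sample_rate:
        return True, {"ts": now, "txn_id": uuid.uuid4().hex}
    return False, None


sampled, cookie = on_page_fetch(None, now=100.0, rng=lambda: 0.0)
print(sampled, cookie["ts"])
```

The transaction identifier stored in the cookie would be reported back with each beacon so that page loads from the same transaction can be correlated offline.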

This sampling approach allows a processing engine to extract the most common transactions from the RT data that has accumulated for a given hostname or customer. Alternatively, since many content providers already have web analytics on their sites, one could develop an interface to these web analytics and extract information about the most common transactions (of course, with authorization from the content provider customer).

In an alternative embodiment, rather than the transaction sampling approach described above, the system instead employs a "master" push table. The offline utility, based on the random individual page samples from a defined scope (which are not necessarily known to be in the same transaction), builds a "master" table of pushable resources. The defined scope may be all pages under the site domain name, or a subdomain, domain plus pathname, or other partition, but preferably the scope is broad. The offline utility can rank the resources using the scoring methodology as described above, resulting in a master table of pushable candidates that are associated with the entire domain (or other defined scope). As before, this table is communicated from the offline utility to servers. The push process then proceeds as follows: A server receives a client request for a base page, page.html, under a domain example.com. The server responds with the base page and, after client requests, with the embedded resources (although some of these may be pushed according to the single-page proposal). After this time, preferably after an idle time such as 200 ms, the server pushes top ranked resources for the example.com domain. While the specific transaction (i.e., web navigation sequence) is not known to the server in this situation, if the client requests another page on the example.com domain, the pushed candidates represent the embedded resources most likely to be requested, per the master table. As those skilled in the art will understand, the logic of the approach follows analogously regardless of the scope of the master table.
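
A simple sketch of selecting resources from a master table is shown below. The function name, the table layout (URL mapped to score), the `limit` parameter, and the step of skipping resources already delivered on the connection are assumptions added for illustration.

```python
def next_master_pushes(master_table, already_delivered, limit=5):
    """Pick the top-ranked resources for the scope that have not been sent yet.

    master_table: dict mapping resource URL to its score for the whole scope.
    already_delivered: set of URLs already served or pushed on this connection
        (skipping these is an assumption, not stated in the source).
    """
    ranked = sorted(master_table.items(), key=lambda item: item[1], reverse=True)
    return [url for url, _score in ranked if url not in already_delivered][:limit]


table = {"/a.css": 0.9, "/b.js": 0.7, "/c.png": 0.4}
print(next_master_pushes(table, already_delivered={"/a.css"}, limit=1))  # ['/b.js']
```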

Identifying Which Transaction an End-User is Starting

After identifying the most common transactions based on URLs visited, it becomes necessary for the server to identify which transaction an end-user is in before deciding what to push. If the transactions identified in the above steps do not share URLs, or do not share URLs early in the transaction, then the first or second page request issued by the client might be sufficient to identify the transaction at the server. If many transactions have a common starting point (e.g., a common page), then the server has less information upon which to select objects for push. In this case, it can identify the full suite of possible transactions based on the transaction steps seen so far, and assign a probability to each based on how common that transaction is in the data as a whole. Then it can identify all pushable objects across all such transactions, and discount the object scores (i.e., for a given transaction) based on the above-computed probability of how common the transaction is in the data as a whole. From these results it can choose objects to push, applying some minimum threshold value (e.g., about 20%, or in other embodiments in a range of about 10-30%, or other value) to avoid pushing any object that is unlikely to be needed.
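
The selection described above might look like the following sketch. Several details are assumptions rather than statements from the text: probabilities are normalized over the transactions that match the pages seen so far, the discount is a simple multiplication of score by probability, an object appearing in several transactions keeps its highest discounted score, and scores are taken to lie between 0 and 1 so that the 20% threshold applies directly. All names are hypothetical.

```python
def transaction_probabilities(counts, pages_seen):
    """Probability of each known transaction consistent with the pages seen.

    counts: dict mapping a transaction (tuple of URLs) to how often it was observed.
    """
    n = len(pages_seen)
    matching = {t: c for t, c in counts.items() if list(t[:n]) == list(pages_seen)}
    total = sum(matching.values())
    return {t: c / total for t, c in matching.items()} if total else {}


def choose_pushes(object_scores, probs, min_score=0.20):
    """Discount each object's score by its transaction's probability and keep
    the objects at or above the minimum threshold, best first."""
    best = {}
    for txn, p in probs.items():
        for obj, score in object_scores.get(txn, {}).items():
            best[obj] = max(best.get(obj, 0.0), score * p)
    keep = [obj for obj, s in best.items() if s >= min_score]
    return sorted(keep, key=lambda obj: best[obj], reverse=True)


shop = ("/", "/search", "/product")
account = ("/", "/account")
counts = {shop: 60, account: 30, ("/blog",): 10}
scores = {
    shop: {"search.js": 0.9, "cart.css": 0.2},
    account: {"account.js": 0.9, "cart.css": 0.3},
}
probs = transaction_probabilities(counts, ["/"])
print(choose_pushes(scores, probs))  # ['search.js', 'account.js']
```

In this example `cart.css` is dropped because its discounted score stays below the threshold in both candidate transactions.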

Computer Based Implementation

The subject matter described herein may be implemented with computer systems, as modified by the teachings hereof, with the processes and functional characteristics described herein realized in special-purpose hardware, general-purpose hardware configured by software stored therein for special purposes, or a combination thereof.

Software may include one or several discrete programs. A given function may comprise part of any given module, process, execution thread, or other such programming construct. Generalizing, each function described above may be implemented as computer code, namely, as a set of computer instructions, executable in one or more microprocessors to provide a special purpose machine. The code may be executed using conventional apparatus (such as a microprocessor in a computer, digital data processing device, or other computing apparatus) as modified by the teachings hereof. In one embodiment, such software may be implemented in a programming language that runs in conjunction with a proxy on a standard Intel hardware platform running an operating system such as Linux. The functionality may be built into the proxy code, or it may be executed as an adjunct to that code.

While in some cases above a particular order of operations performed by certain embodiments is set forth, it should be understood that such order is exemplary and that they may be performed in a different order, combined, or the like. Moreover, some of the functions may be combined or shared in given instructions, program sequences, code portions, and the like. References in the specification to a given embodiment indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic.

FIG. 5 is a block diagram that illustrates hardware in a computer system 500 on which embodiments of the invention may be implemented. The computer system 500 may be embodied in a client device, server, personal computer, workstation, tablet computer, wireless device, mobile device, network device, router, hub, gateway, or other device.

Computer system 500 includes a microprocessor 504 coupled to bus 501. In some systems, multiple microprocessors and/or microprocessor cores may be employed. Computer system 500 further includes a main memory 510, such as a random access memory (RAM) or other storage device, coupled to the bus 501 for storing information and instructions to be executed by microprocessor 504. A read only memory (ROM) 508 is coupled to the bus 501 for storing information and instructions for microprocessor 504. As another form of memory, a non-volatile storage device 506, such as a magnetic disk, solid state memory (e.g., flash memory), or optical disk, is provided and coupled to bus 501 for storing information and instructions. Other application-specific integrated circuits (ASICs), field programmable gate arrays (FPGAs) or circuitry may be included in the computer system 500 to perform functions described herein.

Although the computer system 500 is often managed remotely via a communication interface 516, for local administration purposes the system 500 may have a peripheral interface 512 that communicatively couples computer system 500 to a user display 514 that displays the output of software executing on the computer system, and an input device 515 (e.g., a keyboard, mouse, trackpad, touchscreen) that communicates user input and instructions to the computer system 500. The peripheral interface 512 may include interface circuitry and logic for local buses such as Universal Serial Bus (USB) or other communication links.

Computer system 500 is coupled to a communication interface 516 that provides a link between the system bus 501 and an external communication link. The communication interface 516 provides a network link 518. The communication interface 516 may represent an Ethernet or other network interface card (NIC), a wireless interface, modem, an optical interface, or other kind of input/output interface.

Network link 518 provides data communication through one or more networks to other devices. Such devices include other computer systems that are part of a local area network (LAN) 526. Furthermore, the network link 518 provides a link, via an internet service provider (ISP) 520, to the Internet 522. In turn, the Internet 522 may provide a link to other computing systems such as a remote server 530 and/or a remote client 531. Network link 518 and such networks may transmit data using packet-switched, circuit-switched, or other data-transmission approaches.

In operation, the computer system 500 may implement the functionality described herein as a result of the microprocessor executing program code. Such code may be read from or stored on memory 510, ROM 508, or non-volatile storage device 506, which may be implemented in the form of disks, tapes, magnetic media, CD-ROMs, optical media, RAM, PROM, EPROM, and EEPROM. Any other non-transitory computer-readable medium may be employed. Executing code may also be read from network link 518 (e.g., following storage in an interface buffer, local memory, or other circuitry).

A client device may be a conventional desktop, laptop or other Internet-accessible machine running a web browser or other rendering engine, but as mentioned above a client may also be a mobile device. Any wireless client device may be utilized, e.g., a cellphone, pager, a personal digital assistant (PDA, e.g., with GPRS NIC), a mobile computer with a smartphone client, tablet or the like. Other mobile devices in which the technique may be practiced include any access protocol-enabled device (e.g., iOS™-based device, an Android™-based device, other mobile-OS based device, or the like) that is capable of sending and receiving data in a wireless manner using a wireless protocol. Typical wireless protocols include: WiFi, GSM/GPRS, CDMA or WiMax. These protocols implement the ISO/OSI Physical and Data Link layers (Layers 1 & 2) upon which a traditional networking stack is built, complete with IP, TCP, SSL/TLS and HTTP. The WAP (wireless access protocol) also provides a set of network communication layers (e.g., WDP, WTLS, WTP) and corresponding functionality used with GSM and CDMA wireless networks, among others.

In a representative embodiment, a mobile device is a cellular telephone that operates over GPRS (General Packet Radio Service), which is a data technology for GSM networks. Generalizing, a mobile device as used herein is a 3G (or next generation) compliant device that includes a subscriber identity module (SIM), which is a smart card that carries subscriber-specific information, mobile equipment (e.g., radio and associated signal processing devices), a man-machine interface (MMI), and one or more interfaces to external devices (e.g., computers, PDAs, and the like). The techniques disclosed herein are not limited for use with a mobile device that uses a particular access protocol. The mobile device typically also has support for wireless local area network (WLAN) technologies, such as Wi-Fi. WLAN is based on IEEE 802.11 standards. The teachings disclosed herein are not limited to any particular mode or application layer for mobile device communications.

It should be understood that the foregoing has presented certain embodiments of the invention that should not be construed as limiting. For example, certain language, syntax, and instructions have been presented above for illustrative purposes, and they should not be construed as limiting. It is contemplated that those skilled in the art will recognize other possible implementations in view of this disclosure and in accordance with its scope and spirit. The appended claims define the subject matter for which protection is sought.

It is noted that trademarks appearing herein are the property of their respective owners and used for identification and descriptive purposes only, given the nature of the subject matter at issue, and not to imply endorsement or affiliation in any way.

1. A method performed by one or more computing machines, comprising: identifying a particular object X as a push candidate for each of a plurality of pages 1 . . . N, where each page represents a step in a transaction; determining a frequency with which a particular resource associated with a given page is loaded by a plurality of clients, so as to produce freq_1 . . . freq_N; receiving requests from a particular client for each of the plurality of pages 1 . . . N; receiving a request from the particular client for a particular page that comes after the plurality of pages 1 . . . N in the transaction; computing a probability that the particular client already has the particular object; determining whether to push the particular object to the particular client, in response to the request for the particular page, based at least in part on the probability; when the determination is to push the particular object, pushing the particular object to the particular client in response to the particular request; when the determination is not to push the particular object, not pushing the particular object to the particular client in response to the particular request.

2. The method of claim 1, wherein determining whether to push the particular object to the particular client, in response to the request for the particular page, based at least in part on the probability comprises comparing the probability to a threshold.

3. The method of claim 1, further comprising: collecting, from the plurality of clients, resource timing data for the given page over a plurality of client page loads, to determine the frequencies.

4. (canceled)

5. The method of claim 1, wherein the transaction is a web page navigation sequence in an e-commerce transaction.

6. A computer apparatus having at least one microprocessor and memory storing computer-readable instructions for execution on the at least one microprocessor, the instructions comprising: instructions for identifying a particular object X as a push candidate for each of a plurality of pages 1 . . . N, where each page represents a step in a transaction; instructions for determining a frequency with which a particular object associated with a given page is loaded by a plurality of clients, so as to produce freq_1 . . . freq_N; instructions for receiving requests from a particular client for each of the plurality of pages 1 . . . N; instructions for receiving a request from the particular client for a particular page that comes after the plurality of pages 1 . . . N in the transaction; instructions for computing a probability that the particular client already has the particular object cached; instructions for determining whether to push the particular object to the particular client, in response to the request for the particular page, based at least in part on the probability; when the determination is to push the particular object, pushing the particular object to the particular client in response to the particular request; when the determination is not to push the particular object, not pushing the particular object to the particular client in response to the particular request.

7. The computer apparatus of claim 6, wherein determining whether to push the particular object to the particular client, in response to the request for the particular page, based at least in part on the probability comprises comparing the probability to a threshold.

8. The computer apparatus of claim 6, further comprising: instructions for collecting, from the plurality of clients, resource timing data for the given page over a plurality of client page loads, to determine the frequencies.

9. (canceled)

10. The computer apparatus of claim 6, wherein the transaction is a web page navigation sequence in an e-commerce transaction.

11. A method, comprising: with a computer system, for each of a plurality of pages associated with a particular web navigation sequence: collecting, from a plurality of clients, resource timing data for the given page over a plurality of client page loads; determining scores for a set of one or more objects for pushing from a client from a server, based on a function of the resource timing data; sending the scores to a server distinct from the computer system; at the server: receiving a request for HTML associated with a given page in the plurality of pages, from a client, where the given page is common to a plurality of web navigation sequences; determining a probability that a client device is beginning the particular web navigation sequence associated with the plurality of pages; discounting a score associated with at least one of the one or more objects based on the probability that the client is beginning the particular web navigation sequence.

12. The method of claim 11, wherein determining the probability comprises identifying possible web navigation sequences based on the page requests seen so far, and assigning a probability to the web navigation sequence based on how common that web navigation sequence is.

13.-14. (canceled)